Distributed Supervised Statistical Learning

Author: Khalili Mahmoudabadi Amir
Publication venue: 'Brock University Library'
Publication date: 13/09/2023

We live in the era of big data. Many companies now hold datasets so large that, in most cases, they cannot be stored and processed on a single computer. Such data often has to be distributed over multiple computers, which makes storage, pre-processing, and analysis feasible in practice, and distributed learning has gained popularity as a way to manage these enormous datasets. In this thesis, we focus on distributed supervised statistical learning, where sparse linear regression analysis is performed in a distributed framework. These methods are frequently applied to large-scale data analysis in a variety of disciplines, including engineering, economics, and finance. One key question in distributed learning is how to efficiently aggregate multiple estimators obtained from data subsets stored on different computers. We survey recent studies on distributed statistical inference. Many efficient ways of aggregating local estimates have been proposed, and the most popular methods are discussed in this thesis. An important question, the number of machines to deploy, has recently been addressed for several estimation methods, and notable answers to it are also reviewed. We consider a specific class of Liu-type shrinkage estimation methods for distributed statistical inference, and we conduct a Monte Carlo simulation study to assess the performance of these methods in a distributed framework.
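To make the aggregation question in the abstract concrete, the following is a minimal sketch and not the thesis's own procedure. It assumes the classical Liu (1993) estimator, (X'X + I)^{-1}(X'y + d·β_OLS), as a stand-in for the Liu-type class, and simple averaging of the local estimates as the aggregation rule. The function names, the shrinkage parameter value `d=0.5`, the number of machines, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def liu_estimator(X, y, d=0.5):
    """Classical Liu estimator (assumed stand-in for the Liu-type class):
    (X'X + I)^{-1} (X'y + d * beta_ols), with 0 < d <= 1."""
    p = X.shape[1]
    gram = X.T @ X
    xty = X.T @ y
    beta_ols = np.linalg.solve(gram, xty)  # local least-squares fit
    return np.linalg.solve(gram + np.eye(p), xty + d * beta_ols)

def distributed_average(X, y, n_machines, d=0.5):
    """Split the rows across machines, fit each block locally,
    and aggregate by simple averaging (an assumed aggregation rule)."""
    X_parts = np.array_split(X, n_machines)
    y_parts = np.array_split(y, n_machines)
    local = [liu_estimator(Xk, yk, d) for Xk, yk in zip(X_parts, y_parts)]
    return np.mean(local, axis=0)

# Simulated sparse linear model (hypothetical settings).
rng = np.random.default_rng(0)
n, p = 2000, 5
beta_true = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(size=n)

beta_hat = distributed_average(X, y, n_machines=4)
print(np.round(beta_hat, 3))
```

Repeating the simulation over many random draws and varying `n_machines` would give a small Monte Carlo comparison of the kind the abstract describes, although the thesis's actual design and estimators may differ.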

Brock University Digital Repository